Have tidy ensure that we document all `unsafe` blocks in libcore
#63793
r? @KodrAus (rust_highfive has picked a reviewer for you, use r? to override) |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
are the six now-safe intrinsics ( |
No, I just did them here because otherwise I'd have had to add |
I'm not personally a huge fan of style lints like this in the sense that it just forces unhelpful boilerplate documentation to be written in many circumstances. Most of the safety comments I'm seeing sort of just trivially reference the line above it so I'm not really sure how a comment actually bolsters understanding of the function because it's all being read anyway. There's also other comments like "SAFETY: transmuting a sequence of Overall I think I would prefer to not have a lint that guarantees all |
It's no secret I think this PR is sorely needed.
I don't think each and every safety comment needs to be super helpful and if there are trivially true cases then those can just say "trivially safe". Often however, it's not as trivially true as one experienced reviewer, e.g., @alexcrichton, who often looks at
This suggests that it's in fact not as trivial as one might think and that more elaboration is needed rather than not having a comment at all. It's also not the trivial cases that we wish to catch with these comments but it's also not possible to distinguish them without far more complicated semantic analysis (proofs in e.g. Coq). I think the cost of a bit of boilerplate in trivial cases is well worth it if we get to have commentary on more complicated cases. One might say that this problem should be solved with good reviews. To me however, it is clear that reviews have not been satisfactory thus far. Also, from a language team perspective, I find it really important to know what assumptions the standard library makes on our operational semantics. |
@rfcbot fcp close I'm going to propose that we close this. Other members of the libs team can of course disagree! I explained some reasoning above. @Centril I personally think that annoying lints can be actively harmful because they encourage the wrong coding patterns and give a false sense of security or correctness. An No one writing these |
Team member @alexcrichton has proposed to close this. The next step is review by the rest of the tagged team members: No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
Do you think documenting why an
This feels to me like "because we cannot have perfect security or correctness we should not try". I'd like to point to @RalfJung's comments 1 and 2 which feel relevant.
Comments are not intended as a proof of correctness. However, they can at least be read by humans during review and after the fact. I believe one can use good judgement in the length one gives to unsafe comments. No one has suggested that one should "explain everything".
I can understand that it might be a non-trivial amount for you in libstd's operating system related code. However, in key parts like libcore and liballoc I think it's well worth the time investment.
No one has suggested that you read through the entire nomicon each time you write
I did not say "what assumptions the libs team is making". I think most of the code added to the standard library is in fact added by people outside the libs team. I referred to assumptions of the standard library as a whole. This is useful information when doing future language design or just for internal consistency across the standard library.
Perhaps we interpret "assumptions" differently, but I would hope that you at least consider what invariants you must uphold in an
"Getting shit done"-ism is not an excuse nor in opposition to being deliberate and careful with |
This does not encourage documentation of the safety of |
Strictly speaking, tidy has a way to opt-out per file. Overall it is true however that it mandates documentation, but I think that's a good idea as well. Otherwise, people will oftentimes not do it when it is necessary and I think that's a problem. |
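To make the mechanism under discussion concrete, here is a minimal sketch of a line-based check with a per-file opt-out. It is not the tidy implementation from this PR: the function name `undocumented_unsafe` and the marker string in `OPT_OUT` are hypothetical, and the matching rule (an `unsafe {` line must sit directly below a comment run containing `SAFETY:`) is an assumption chosen for illustration.

```rust
/// Hypothetical per-file opt-out marker (an assumption for this sketch,
/// not necessarily the directive that tidy recognizes).
pub const OPT_OUT: &str = "// ignore-undocumented-unsafe";

/// Returns the 1-based line numbers of `unsafe {` blocks that are not
/// directly preceded by a comment run containing `SAFETY:`.
/// A file containing the opt-out marker is skipped entirely.
pub fn undocumented_unsafe(source: &str) -> Vec<usize> {
    if source.contains(OPT_OUT) {
        return Vec::new();
    }
    let lines: Vec<&str> = source.lines().collect();
    let mut offenders = Vec::new();
    for (idx, line) in lines.iter().enumerate() {
        let trimmed = line.trim_start();
        // Skip comment lines so prose that mentions `unsafe {` is not flagged.
        if trimmed.starts_with("//") || !line.contains("unsafe {") {
            continue;
        }
        // Walk upwards over the contiguous run of comment lines directly above.
        let documented = lines[..idx]
            .iter()
            .rev()
            .map(|l| l.trim_start())
            .take_while(|l| l.starts_with("//"))
            .any(|l| l.contains("SAFETY:"));
        if !documented {
            offenders.push(idx + 1);
        }
    }
    offenders
}
```

A purely textual check like this can only tell whether a comment exists, not whether it is any good, which is the crux of the disagreement in this thread. The opt-out is what allows files to be documented incrementally without the already-documented ones regressing.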
I don't think you'll find anyone on any Rust team who actively says The amount of work to document |
I hope you are right but I think that's a low bar to set.
Elaboration re. what you mean by "wrong coding patterns" would be helpful then as your argument largely felt like "this is security theater".
Because once we have done so, without lints, the documentation coverage would regress over time. If tidy yells at you, even if we have a local override like
I think those concerns have not been fleshed out. I replied re. the specifics of alignment and whatnot and would appreciate fewer generalities in return. Let's be specific and concrete about what those areas are where problems and hindrances occur (like... are there specific file paths you'd not like to lint, like in the OS / FFI code?). Perhaps we can find a middle ground. |
After looking through the diff here, 90+% of them look like "SAFETY: we performed the check that's one line above this unsafe block", or "SAFETY: it's okay to do this". These read to me like line noise - they add no real additional information to the context. It reminds me of Javadoc requirements in some of my college homework assignments where you end up with hundreds of lines of Every PR is code reviewed by another person. It's their job to make sure that the code is sufficiently comprehensible. Sometimes there are tricky invariants or behavior that require a couple of lines of explanation in comments. Sometimes those happen to be oriented around unsafe blocks, but they often aren't. If someone comes along and is confused by some piece of code, that may result in someone making a PR adding some extra commentary on it, but again, that doesn't have anything to do with what kind of expression that code is. |
I hadn't articulated that before, but yeah this sort of feels like "safety theater" where we're trying to give ourselves a sense of security or a sense of "oh here's everything we have to account for", when in fact a strict lint such as this is largely only going to manufacture comments like @sfackler was mentioning. I, too, have seen endless instances of I don't feel this is the time to get sort of more specific than that, I'm not trying to criticize @oli-obk's work here. I'm criticizing the idea that we should have this lint. I think the appropriate way to go about this (if we were to go about this) is to start down the road of documenting I also very much agree with @sfackler that the
FWIW this still feels very disingenuous, "I hope you are right" again seems like it's attempting to cast me, personally, in this ominous light of "oh look at this guy who refuses to document things and doesn't want anyone else to document anything". I get the impression you really think we should actively not document |
I wrote this lint with the impression that most unsafe blocks can have meaningful docs. Then I used it and, as is now obvious, that is not the case. It can be partially resolved by expecting that either the function docs or the module docs (of any of the surrounding modules) contain an explanatory safety comment. This would solve the "read the docs" comments that I added. Since this scheme allows incrementally expanding its scope, I was hoping to iterate on it to end up at a good scheme. Since very few unsafe blocks are documented so far, I'd be fine with first starting to do this without a lint and then revisiting once we're nearing full documentation. |
Thanks for working on this @oli-obk.
I read all the "SAFETY:" comments added in this PR and there aren't any that I would mind having in the codebase. I would be happy with merging this.
I find this comment from @Centril compelling:
It's also not the trivial cases that we wish to catch with these comments. [...] I think the cost of a bit of boilerplate in trivial cases is well worth it if we get to have commentary on more complicated cases. One might say that this problem should be solved with good reviews. To me however, it is clear that reviews have not been satisfactory thus far.
In response to @alexcrichton's and @sfackler's similar concerns,
Most of the safety comments I'm seeing sort of just trivially reference the line above it so I'm not really sure how a comment actually bolsters understanding of the function because it's all being read anyway.
After looking through the diff here, 90+% of them look like "SAFETY: we performed the check that's one line above this unsafe block", or "SAFETY: it's okay to do this". These read to me like line noise - they add no real additional information to the context.
I understand this reaction but it's meaningful to call out when the check on the previous line is the only consideration in an unsafe block being okay, as opposed to some other ambient invariant that comes in from elsewhere. We could reserve comments for only those unsafe blocks with a requirement more complicated than the previous line, but when touching nearby code it's a big help to be told when the safety reasoning is all just local.
☔ The latest upstream changes (presumably #64535) made this pull request unmergeable. Please resolve the merge conflicts. |
I am fully in favor of this PR! I am sure @Shnatsel's safety dance would benefit a lot from such conventions being more widely applied in the ecosystem.
I think it is extremely informative that the author of an unsafe block thought that this unsafe block is justified by just locally checking the line(s) above. That makes review just so much easier: if, after a local check, I am not convinced that this unsafe block is correct, I don't need to start a complex analysis of the invariants in this file (because maybe the block actually is correct but the argument is more complex). I know immediately that the author of the code made a mistake, saving a lot of time. The comment saying "correct because we just checked X" takes very little time to write, and documents an important piece of information that makes audits a lot simpler. I think that's a good comment.
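As an illustration of the "correct because we just checked X" kind of comment, here is a small sketch. The function `first_byte` is made up for this example and is not taken from the PR diff.

```rust
/// Returns the first byte of `bytes`, or `None` if the slice is empty.
/// (Hypothetical helper, for illustration only.)
pub fn first_byte(bytes: &[u8]) -> Option<u8> {
    if bytes.is_empty() {
        return None;
    }
    // SAFETY: `bytes` is non-empty (checked just above), so index 0 is in bounds.
    Some(unsafe { *bytes.get_unchecked(0) })
}
```

The comment repeats what the previous lines already show, but it also tells a reviewer that nothing outside this function needs to be consulted to judge the block.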
So the correctness argument is more subtle than @oli-obk thought; all the more reason to write it down, I'd say.
I assume that you have some argument in your head for why this unsafe block will not cause UB. Sure, the argument does not have to be a full-blown Coq proof, but my expectation is that people don't just write unsafe and hope that it'll work out. Given that, all we ask is that this -- informal, possibly incomplete -- argument be written down. The argument has already been made by the programmer to themselves; putting it down in writing (a) helps them clarify their own thinking, and (b) helps others by not making them re-do the same argument again from scratch.
Nobody is assuming bad intentions here. But how are we supposed to figure out guidelines if we do not even know the kind of reasoning y'all are using to justify the use of unsafe operations to yourself? The PR does not ask that you quote a UCG document for every piece of reasoning, it asks that you give a brief hint for why you think this is or should be correct. So, beyond the benefits listed above, such comments actually are an enormously useful resource to the UCG as they document what some very experienced Rust programmers think are good rules for unsafe. If that does not align with what we think, we'd like to know so that we can start a discussion. But if your thinking is never written down, we won't notice the mismatch. Doing language design "in the dark", without real-world input, is really hard; such comments would go a long way to help us ground our efforts and make us aware of common and important unsafe idioms. As @Centril said, there might also be code where this is less useful, such as code interacting closely with OS/FFI interfaces. Let's not lint on that code and instead find areas where it can be more useful. But by sheer volume, that is a minority of the code in this repo, so we have lots of other code that could benefit a lot from such a lint.
Rust has a history of using tools to guide our process, as experience shows that a policy not checked by a tool is most likely not going to be followed very well. We have a large and diverse set of PR authors and reviewers, and at least I am not aware of a "checklist" that reviewers are supposed to follow. So, basically, any repo-wide effort not guarded by a tool is unlikely to actually work. This went far enough that when we had PRs that did some repo-wide language editing (full stops and capitalization and things like that), the general response was "those PRs are not worth it unless we have a tool that checks that these mistakes do not creep back in". I would say the same argument applies here: trying to document unsafe blocks is not worth it unless we have a tool that makes sure no new undocumented unsafe blocks can creep back in. Personally I feel like some checks tidy is doing (such as line lengths or number of lines or banning If we had a "reviewer checklist", a good first step might be to add a requirement of documentation to that checklist. But we don't, or rather, our checklist is called You are saying you are fine to encourage comments on unsafe blocks, but how, concretely, do you suggest we actually do that in a way that is effective? The way I view it,
@Centril and I have argued above for why that does add information. It severely limits the search space of someone auditing the code. It is also very helpful for the UCG to know that most unsafe blocks are "trivial"/"local". In fact, I'd even be in favor of having a special keyword for this case so that we can have statistics on this, but that might be asking too much.
Disagreed, for the reasons mentioned above.
Then the unsafe blocks could still use a marker either referencing a module-level invariant or not. The key important property of In that process, the hardest part is figuring out just how much information you need to argue for some unsafe block's correctness. Even a brief comment saying "local checks" vs "module-level invariant" goes a long way to declare the scope that needs to be considered here, making unsafe-centric review so much simpler. @Shnatsel (and maybe others from safety-dance), you have done quite a bit of review of other people's unsafe code; your input in this matter would be much appreciated. (I wrote this as I am catching up in this thread, and now realize many of these points have already been raised by @Centril. So I guess what I am saying is, I am in full agreement, and I think this lint would benefit both the quality of code in libcore, would greatly simplify unsafe-centric code audits, and would provide tons of useful input to the UCG and lang team.) |
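To contrast the two scopes named above, here is a sketch of an unsafe block whose justification is a type-level invariant rather than a local check. The `NonEmpty` type is hypothetical and exists only for this example.

```rust
/// A vector that is never empty. (Hypothetical type, for illustration only.)
///
/// Invariant: `items.len() >= 1`. The field is private, `new` starts with
/// one element, and no method removes elements.
pub struct NonEmpty<T> {
    items: Vec<T>,
}

impl<T> NonEmpty<T> {
    /// Creates a collection holding exactly `first`.
    pub fn new(first: T) -> Self {
        NonEmpty { items: vec![first] }
    }

    /// Appends an element; this can only grow the collection.
    pub fn push(&mut self, item: T) {
        self.items.push(item);
    }

    /// Returns the first element without a bounds check.
    pub fn first(&self) -> &T {
        // SAFETY: relies on the type-level invariant `items.len() >= 1`
        // (see the struct docs), not on any check in this function.
        unsafe { self.items.get_unchecked(0) }
    }
}
```

Here the comment tells an auditor that the whole `impl` has to be reviewed: adding a `pop` method later would silently invalidate `first`, and nothing near the unsafe block itself would reveal that.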
Thanks @RalfJung & @dtolnay; excellent points! I'll make two further ones which I believe have not been raised prior:
This is ready for review again |
r? @dtolnay |
📌 Commit e28287b has been approved by |
⌛ Testing commit e28287b with merge d6d3560f88327c70c2d8a2244e2b7abc668bbd75... |
Have tidy ensure that we document all `unsafe` blocks in libcore cc @rust-lang/libs I documented a few and added ignore flags on the other files. We can incrementally document the files, but won't regress any files this way.
@bors retry rolled up. |
⌛ Testing commit e28287b with merge 35386f2dc586e83ed4d4a4d07cd325014f09232c... |
@bors retry rolled up. |
Rollup of 5 pull requests Successful merges: - #63793 (Have tidy ensure that we document all `unsafe` blocks in libcore) - #64696 ([rustdoc] add sub settings) - #65916 (syntax: move stuff around) - #66087 (Update some build-pass ui tests to use check-pass where applicable) - #66182 (invalid_value lint: fix help text) Failed merges: r? @ghost